The Evolving Field of Distributed Storage

Author

  • Sumeet Sobti
Abstract

Memory is a fundamental commodity of computation. Storage systems provide memory that costs far less than RAM and has greater persistence, but suffers from much lower data transfer bandwidths and higher access latencies. These attributes (cost, persistence, bandwidth, and latency) are the traditional evaluation metrics for storage systems, but the remarkable growth of communications and networking over the past few decades has complicated this simple picture. Today the network is an integral part of the computer. Most of us routinely access Web pages whose display requires fetching data from dozens of machines around the world.

The Web is the first distributed storage system to have such an immediate global impact. It illustrates the technological, economic, and cultural power of a distributed approach. However, the Web's fragility and operational semantics prevent it from addressing the storage problems of mainstream data processing. For example, error messages or a suspended display are common when a network or system component fails somewhere, making a page or one of its elements inaccessible. Also, informal caching on the Web makes it hard to be certain that you're viewing current information.

A simple form of distributed file storage is widespread now, as many of us routinely and transparently access files stored somewhere else on a local area network. Between LANs and the World Wide Web lies the domain of distributed enterprise-wide storage, an area that industry is now actively developing. In this increasingly complex and demanding world of distributed storage, we are forced to consider new metrics and issues beyond the traditional set. These include shared coherent access, availability, survivability, security, interoperability, search, caching, load balancing, and scale: the need for storage systems of truly immense proportions.
Indeed, our increased appetite for storage has also engendered another design issue: the need to largely automate the now human-intensive task of managing large storage systems. Finally, an undercurrent in the flow of ideas concerns the cultural issues of privacy and anonymity in the context of distributed storage. The builders of distributed storage systems face many architectural decisions as they work toward their targets among these metrics and issues. The most basic of these is the question, "Who's doing the work of providing storage services?" Hierarchical approaches, including the new trend toward storage virtualization, use layers of control and abstraction to stitch together distributed and disparate storage providers into a …


Similar resources

Improving Data Grids Performance by Using Modified Dynamic Hierarchical Replication Strategy

Abstract: A Data Grid connects a collection of geographically distributed computational and storage resources, enabling users to share data and other resources. Data replication, a technique much discussed by Data Grid researchers in recent years, creates multiple copies of a file and places them in various locations to shorten file access times. In this paper, a dynamic data replication strate...


An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity

Data grid technology, which uses the scale of the Internet to overcome storage limitations for huge amounts of data, has become one of the hot research topics. Recently, data replication strategies have been widely employed in distributed environments to copy frequently accessed data to suitable sites. The primary purposes are shortening the distance of file transmission and achieving files from ...


Improving Data Availability Using Combined Replication Strategy in Cloud Environment

As data-intensive applications in cloud computing grow day after day, data popularity in this environment becomes critical. Hence, to improve data availability and provide efficient access to popular data, replication algorithms are now widely used in distributed systems. However, most of them only replicate a static number of replicas on some requested chosen sites and it is ob...


E2DR: Energy Efficient Data Replication in Data Grid

Abstract: Data grids are an important branch of grid computing that provide mechanisms for the management of large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems has traditionally focused on performance improvements driven by the demands of client applications in scientific and business domai...


Data Replication-Based Scheduling in Cloud Computing Environment

Abstract: High-performance computing and vast storage are two key factors required for executing data-intensive applications. In comparison with traditional distributed systems such as data grids, cloud computing provides these factors on a more affordable, scalable, and elastic platform. Furthermore, accessing data files is critical for running such applications. Sometimes accessing data becomes...




Publication date: 2001